(57) A system for creating a high resolution image from a sequence of lower resolution motion images
produces a mapping transformation for each low resolution image to map pixels in each low resolution
image into locations in the high resolution image. A combined point spread function (PSF) is computed
for each pixel in each lower resolution image employing the mapping transformations. The high
resolution image is generated from the lower resolution images employing the combined PSF's by
projection onto convex sets (POCS).

Description

Field of Invention

The present invention is related to the field of digital image processing, and more specifically to a
technique for obtaining a high-resolution still image from a sequence of motion images that are of lower
resolution, suffering from blur and noise, and under-sampled over an arbitrary lattice.

Background of The Invention

The present invention relates to a method and apparatus for reconstructing high-resolution still images
from multiple motion images (i.e., images containing relative displacement) that are under-sampled over
an arbitrary spatiotemporal lattice and may suffer from blur and noise degradation. The lower-resolution,
degraded, multiple motion images may be a series of still images of a particular scene, acquired by an
electronic still camera operating sequentially in time, or may be frames digitized from a video signal.

Such images are usually undersampled over a sparse spatial lattice (i.e., sampling lattice) due to the
presence of a color filter array that samples color channels over a sparse set of points, and/or due to
interlacing. In addition, such images suffer from blur and noise. The blur may be due to sensor
integration and/or relative motion between the scene and the camera, and/or nonzero aperture time, and/or
camera optics such as defocused lenses. The imaging sensor and the digitization and quantization
processes introduce noise. We refer to images that suffer from one or more of these effects as lower
resolution images.

Given a multitude of lower resolution images, it is often desirable to create a still image that is of
higher quality, for various purposes, such as creating a good quality hard copy print. The resulting
still image should have a larger number of samples, preferably over a regular rectangular lattice that is
denser than that of the given images, to reduce aliasing effects, and should be free of the effects of
blur and noise.

In US 5,341,174, Xue et al. describe a method where the resulting image is obtained by mapping samples
from neighboring images onto a selected image, on the basis of relative motion information, in order to
increase the number and density of its samples. This approach is limited to interlaced video and does not
account for blur and noise degradations. If the data is degraded by blur and/or noise, it is used as it is.

Blur due to the integration at the sensor is accounted for in producing a still image in methods
discussed in M. Irani and S. Peleg, "Motion analysis for image enhancement: resolution, occlusion, and
transparency," J. of Visual Comm. and Image Representation, vol. 4, pp. 324-335, December 1993; S. Mann
and R. Picard, "Virtual bellows: Constructing high-quality stills from video," in IEEE Int. Conf. Image
Proc., (Austin, TX), November 1994; and M. Irani and S. Peleg, "Improving resolution by image
registration," CVGIP: Graphical Models and Image Processing, vol. 53, pp. 231-239, May 1991. These
methods, however, do not account for the aperture time, and hence do not properly handle the motion blur.
Furthermore, they do not model, and hence do not account for, the noise degradation. Consequently, the
still images created by these methods may still suffer from motion blur and noise degradation.
Furthermore, these methods assume that the input lower resolution images are sampled over a regular,
rectangular lattice. If the input images are obtained from an interlaced video, for instance, they should
first be deinterlaced (i.e., converted to progressive images sampled over a regular rectangular lattice)
prior to the application of the method. Otherwise, the methods are limited to non-interlaced, progressive
input images.

The method of high resolution reconstruction discussed in A. M. Tekalp et al., "High-resolution image
reconstruction from lower-resolution image sequences and space-varying image restoration," in IEEE Int.
Conf. Acoust., Speech, and Signal Proc., (San Francisco, CA), vol. III, pp. 169-172, March 1992, uses a
projections onto convex sets (POCS) based method that accounts for blur due to sensor integration and
noise. It does not, however, account for motion blur, and is applied to non-interlaced, progressive input
images only.

Summary of the Invention

One of the objects of this invention is to provide a method for addressing the effects of all above
mentioned degradations, namely aliasing (due to spatial undersampling over an arbitrary lattice), sensor
blur (due to spatial integration at the sensor, and temporal integration during the aperture time in the
presence of relative scene-sensor motion), optical blur (due to defocused lenses), and noise (sensor and
quantization noise), in creating a high-quality still image. Another one of the objects of this invention
is to provide a complete solution to simultaneous modeling and removal of the effects of aliasing,
blurring due to sensor integration, optics, motion, and contamination due to noise. It can therefore
create a high resolution still image, or a sequence. Still another object is to reconstruct a
high-resolution image from lower resolution images that are sampled over an arbitrary sampling lattice, by
reducing the effects of aliasing, by way of increasing the sampling density, and reducing the effects of
blurring due to sensor integration and noise contamination. Yet another object of the present invention
is to reconstruct a high-resolution image from lower resolution images that are sampled over an arbitrary
sampling lattice, by reducing the effects of aliasing, by way of increasing the sampling density, and
reducing the effect of blurring due to sensor integration.

Another object of the invention is to provide a method that can be utilized in a digital still image
camera equipped with a high-resolution mode. The high-resolution mode works by invoking a "burst" mode,
in which successive images with relative motion are rapidly acquired, and then processed according to the
method using either in-camera hardware, or off-line software/hardware processing capabilities, to produce
a high resolution still image. Alternatively, successive images containing relative motion can be acquired
using an ordinary electronic still camera. One other object of the invention is to provide a method that
can be utilized to process images that are acquired by a video camera. Images are processed according to
the present invention using either in-camera hardware, or off-line software/hardware processing
capabilities, to produce a high resolution still image. The high resolution image is spatially sampled
with higher density than is intrinsic to the color filter array (CFA) detector and is non-interlaced. Such
a camera is useful, for example, in a desktop video conference system in instances where transmission of
very high resolution still images of text, drawings, or pictures is desired.

The objects are achieved according to the present invention by providing a system for creating a high
resolution image from a sequence of lower resolution motion images that produces a mapping transformation
for each low resolution image to map pixels in each low resolution image into locations in the high
resolution image. A combined point spread function (PSF) is computed for each pixel in each lower
resolution image employing the mapping transformations. The high resolution image is generated from the
lower resolution images employing the combined PSF's by projection onto convex sets (POCS).

According to a first embodiment, the step of producing a mapping transformation includes:

a. selecting one of the lower resolution images as a reference image;

b. estimating a mapping transformation describing the relative motion at each pixel between the reference
lower resolution image and each other lower resolution image;

c. testing the validity of the estimated mapping transformation for each other lower resolution image
pixel and flagging valid mapping transformations; and

d. scaling each valid mapping transformation from the lower resolution images to the high resolution image.

Advantageously, the step of computing a combined PSF includes:

a. calculating an effective sampling aperture relative to the high resolution image for each pixel in each
lower resolution image employing the mapping transformations;

b. calculating PSF's for the effective sampling apertures;

c. defining an optical PSF; and

d. combining the calculated PSF for each pixel with the optical PSF to produce the combined PSF for each pixel.
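
By way of illustration only, step d may be sketched as follows. The sketch assumes that the per-pixel
aperture PSF and the optical PSF are both available as small 2-D arrays on the high resolution grid, and
that the combination is a full 2-D discrete convolution followed by normalization to unit sum. The names
`convolve2d_full` and `combine_psf` are hypothetical and are not taken from this specification.

```python
def convolve2d_full(a, b):
    """Full 2-D discrete convolution of two non-empty lists of lists."""
    rows = len(a) + len(b) - 1
    cols = len(a[0]) + len(b[0]) - 1
    out = [[0.0] * cols for _ in range(rows)]
    for i, row_a in enumerate(a):
        for j, va in enumerate(row_a):
            for p, row_b in enumerate(b):
                for q, vb in enumerate(row_b):
                    out[i + p][j + q] += va * vb
    return out


def combine_psf(aperture_psf, optical_psf):
    """Convolve the aperture PSF with the optical PSF and normalize to unit sum.

    Assumption: both kernels are non-negative and not identically zero.
    """
    combined = convolve2d_full(aperture_psf, optical_psf)
    total = sum(sum(row) for row in combined)
    return [[v / total for v in row] for row in combined]
```

Normalizing after the convolution keeps the mean intensity unchanged when the combined PSF is later
applied to the high resolution estimate.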

As an example, the step of generating the high resolution image by POCS includes:

a. interpolating one of the lower resolution images to the number of pixels in the high resolution image
to produce an estimate of the high resolution image; and

b. for each pixel in each low resolution image having a valid mapping transformation, refining the
estimate of the high resolution image by

i. selecting a pixel in one of the lower resolution images,

ii. producing a calculated pixel value from the high resolution image by applying the combined PSF for the
selected pixel to the current estimate of the high resolution image, and

iii. forming the difference between the selected pixel value and the calculated pixel value and, if the
magnitude of the difference is greater than a predetermined threshold, back projecting the error into the
current estimate of the high resolution image;

c. clipping the pixel values of the refined estimate of the high resolution image to an allowable range; and

d. repeating steps b and c until a stopping criterion is satisfied.
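
Steps b through d above can be sketched in a few lines. This is a minimal illustration under stated
assumptions, not the claimed implementation: the initial estimate of step a is taken as given, the back
projection uses the commonly used POCS data-consistency projection (the residual in excess of the
threshold is distributed in proportion to the PSF weights), and the stopping criterion is a maximum
iteration count or a negligible update. The function name `pocs_refine`, the dictionary-based data layout
and the default parameter values are hypothetical.

```python
def pocs_refine(estimate, observations, delta=1.0, lo=0.0, hi=255.0,
                max_iters=50, tol=1e-6):
    """Refine a high resolution estimate by cyclic projections.

    estimate:     dict mapping HR pixel index (n1, n2) -> value (step a output).
    observations: list of (g, psf) pairs, one per LR pixel with a valid mapping;
                  g is the observed LR value and psf maps (n1, n2) -> weight.
    delta:        predetermined threshold on the residual magnitude.
    """
    f = dict(estimate)
    for _ in range(max_iters):
        largest_update = 0.0
        for g, psf in observations:
            # Step b.ii: apply the combined PSF to the current estimate.
            predicted = sum(w * f[n] for n, w in psf.items())
            residual = g - predicted
            # Step b.iii: back project only if the residual exceeds the threshold.
            if abs(residual) <= delta:
                continue
            excess = residual - delta if residual > 0 else residual + delta
            energy = sum(w * w for w in psf.values())
            for n, w in psf.items():
                step = excess * w / energy
                f[n] += step
                largest_update = max(largest_update, abs(step))
        # Step c: clip to the allowable range.
        for n in f:
            f[n] = min(hi, max(lo, f[n]))
        # Step d: stop when a full pass changes nothing noticeable.
        if largest_update < tol:
            break
    return f
```

After one projection the calculated pixel value lies exactly on the boundary of the threshold band, so a
second pass over the same observation leaves the estimate unchanged.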

Advantageously, a high resolution video sequence is produced from a sequence of lower resolution video
images by applying the method of the invention a plurality of times to the sequence of lower resolution
video images to produce a video sequence of high resolution images.

The present invention has the advantage that it is capable of processing images that are sampled over an
arbitrary [...]

Figure 4 is a diagram illustrating a sequence of lower resolution images and a selected region of interest
in one of the lower resolution images;

Figures 5 and 6 are diagrams useful in describing the method of producing a mapping transformation
according to the present invention;

Figure 7 is a block diagram illustrating the image formation model used for calculating the combined PSF
according to the present invention;

Figure 8 is a diagram [...] images in [...];

Figure 9 is a diagram useful in describing the PSF for the case of translational motion;

Figure 10 is a diagram depicting the procedure for computing the LSI blur function $h'_2(x; t, t_r)$;

Figure 11 is a diagram useful in describing the segmentation of $h'_2(x_1 - x_{1,bk},\, x_2 - x_{2,bk})$
into regions;

Figure 12 is a diagram useful for describing the approximation used for affine and perspective motion
models in computing the combined effective blur PSF; and

Figure 13 is a diagram useful in describing the POCS-based reconstruction method.

Detailed Description of Invention

The major steps of the present invention are depicted in the flow diagram shown in Fig. 1, where a number
of lower resolution motion images 10 are presented as the input to the image processing method. One of
the input images is selected, by a user, from the input set of lower resolution images 10. This image is
referred to as the reference image and it is that image whose high resolution version will be
reconstructed. The reference image is specified by its time [...]

The present invention has three major processing steps. Referring to Fig. 1, the first processing step 12
provides mapping transformations that map the pixels in each lower resolution input image into locations
in the high resolution [...] other lower resolution images relative to the lower resolution reference
image. A motion estimation method, such as the well-known hierarchical block matching method with
fractional pixel accuracy, can be used. Alternatively, the motion estimation method disclosed in U.S.
Patent No. 5,241,608 issued Aug. 31, 1993 to Fogel may be employed to estimate the motion vector field.
The block matching method is based on a locally translational motion model. Another possibility is to use
affine motion models, as will be described later on, that model zoom, rotation and shear in addition to
translation.
The second processing step 14 uses the mapping transformation information, made available by the
preceding step 12, aperture time, sensor geometry, optical blur point spread function (PSF), and the high
resolution sampling geometry (HR) to compute the combined blur PSF that accounts for the motion and
optical blur, and blur due to integration at the sensor. Computation of the combined blur PSF is based on
an image formation model that is described below.

EP 0 731 800 A2

The high resolution image is created in the third step 16 where the combined blur PSF, the motion
information from the mapping transformations, and the given lower resolution images 10 are used in a
method based on POCS, described in detail in the article by M.I. Sezan, "An overview of convex projections
theory and its applications to image recovery problems," Ultramicroscopy, no. 40, pp. 55-67, 1992. The
high resolution image reconstructed at this final step is an estimate of the high resolution version of
the reference image that has a larger number of samples over a denser, regular rectangular sampling
geometry (HR), regardless of the sampling lattice pattern of the input lower resolution images, and is
free from blur and noise degradation. Finally, the high resolution image is displayed 18 on a display
device such as a CRT or a printer.

The lower resolution images may be sampled with different sampling patterns at different times. When this
happens, a sampling lattice describes the periodic change in the sampling patterns. Typical sampling
lattice patterns for lower resolution, and the high resolution image are depicted in Fig. 2. Pattern (a)
shows a diamond-shaped lattice, Pattern (c) is an interlaced lattice, and Patterns (b) and (d) show the
denser lattice over which the high resolution image is reconstructed. The open circles in the diagram
denote the new samples generated by the high resolution reconstruction process, and the solid circles
denote the sampling pattern in the lower resolution image. Note that Fig. 2 shows a 2x increase in the
effective sampling density. The method of the present invention allows for higher factors of increase, if
desired. Note also that one can generate a high resolution sequence by processing not one but all of the
lower resolution images in a sequential manner, each time designating one of the images as the reference
image. Such a process would be useful, for example, to produce high resolution video.

Referring to Fig. 3, a system useful in practicing the present invention is shown. Input devices such as
a video camcorder/VCR 20 connected to a digitizer 22, a digital still camera 24, a digital video camcorder
26, a digital scanner 28 or a disc storage 30 provide a source of a motion sequence of digital images.
The motion sequence of digital images is supplied to an image processing computer system, generally 32.
The image processing computer system includes a computer 34 such as a Power PC, a CRT display 36 having
typically SVGA or better resolution, and an operator input device such as a keyboard 38 or a mouse. The
computer 34 is connected to an output device such as a printer 40 for creating a hard copy display of the
high resolution image; a storage medium 42 such as an optical disc for storage pending eventual display of
the image; or link to a communication network 44 for distributing the high resolution image for remote
display.

Once the multiple low resolution images are available to the computer system 32 for display on CRT 36, it
is also possible for a user to interactively specify a region of interest in the reference image and
confine the resolution improvement process to that region. Fig. 4 shows a sequence of low resolution
images 46, 48 and 50, where region of interest 52 in image 48 has been identified for high resolution
processing. In this case, the high resolution version of the selected region is reconstructed over a high
resolution sampling geometry, and the result is then down-sampled over the lattice of the lower resolution
image and then placed into the region of interest replacing the values of the original pixels. In the
depiction in Fig. 4, the face of the person forms the region of interest 52; in the resulting image, the
facial detail will be available at high resolution. The user may visually identify regions that correspond
to the selected region of interest 52. Then the process is applied only to these regions rather than to
the entire lower resolution images, resulting in computational savings.



[...] motion vector fields for M lower resolution images to provide mapping transformations that map lower
resolution image pixels to the high resolution image sampling locations. This is graphically depicted in
Fig. 5. In the simplest case, the motion from the lower resolution image 46 to the reference image 48 can
be modeled as a spatially uniform translation. In practice, however, we have found this model to be
sub-optimal. The hierarchical block matching method, to estimate non-uniform translational motion, and
methods based on affine models and estimators are more effectively [...]
The lower resolution images 46, 48, 50, 53 are first bilinearly interpolated over a rectangular lower
resolution lattice, for the purpose of motion estimation, unless they are already available over a
rectangular lattice. For example, a diamond-shaped lower resolution input lattice 54 and the corresponding
lower resolution rectangular lattice 56 are depicted in Fig. 6. The interpolated values of the reference
image are only used for motion estimation and subsequently discarded and replaced with the estimates
during the POCS-based high resolution reconstruction process. A motion vector is estimated for each actual
pixel of the lower resolution images, resulting in M-1 motion vector field estimates.

In the case of a block matching method of motion estimation, the motion is assumed to be locally
translational. When other transformation effects are small, this approximation can be quite effective.
The hierarchical block matching method (HBM) discussed in M. Bierling, "Displacement estimation by
hierarchical block matching," in Proc. SPIE Visual Communications and Image Processing '88, pp. 942-951,
1988, is used to estimate the non-uniform motion field. The matching criterion used is the mean absolute
difference (MAD) between measurement blocks. At each level in the hierarchy, a logarithmic type search is
used.
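
The MAD criterion and a logarithmic type search at a single hierarchy level may be illustrated as
follows. This is a simplified sketch under stated assumptions: images are lists of rows, displacements are
integers, the search evaluates the 3x3 neighborhood of the current best displacement and halves the step
after each pass, and no pre-filtering, subsampling or fractional-pixel refinement is performed. The
function names `mad` and `log_search` are hypothetical, and the sketch is not the HBM method of the cited
reference.

```python
def mad(ref, tgt, top, left, dy, dx, size):
    """Mean absolute difference between a block of ref and a displaced block of tgt."""
    total = 0.0
    for r in range(size):
        for c in range(size):
            total += abs(ref[top + r][left + c] - tgt[top + dy + r][left + dx + c])
    return total / (size * size)


def log_search(ref, tgt, top, left, size, max_disp):
    """Logarithmic search for the displacement (dy, dx) that minimizes the MAD.

    max_disp is the displacement used in the first search step; the step is
    halved after every pass until it drops below one pixel. The zero
    displacement is assumed to keep the block inside tgt.
    """
    h, w = len(tgt), len(tgt[0])
    best = (0, 0)
    step = max_disp
    while step >= 1:
        candidates = [(best[0] + sy * step, best[1] + sx * step)
                      for sy in (-1, 0, 1) for sx in (-1, 0, 1)]
        # Keep only displacements for which the displaced block stays inside tgt.
        valid = [(dy, dx) for dy, dx in candidates
                 if 0 <= top + dy and top + dy + size <= h
                 and 0 <= left + dx and left + dx + size <= w]
        best = min(valid, key=lambda d: mad(ref, tgt, top, left, d[0], d[1], size))
        step //= 2
    return best
```

A returned displacement (dy, dx) means that the block whose top-left corner is (top, left) in the
reference image is best matched at (top + dy, left + dx) in the other image.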

The preferred parameter values that can be used in the implementation of a 5-level HBM are furnished in
Table 1, where the first column is the hierarchy level number, and level 1 is the lowest resolution level.

Table 1

The maximum horizontal/vertical displacement (Max Disp. hor./vert.) is the displacement used in the first
step of the logarithmic search. The horizontal/vertical measurement window size (Window Size hor./vert.)
is the size of the window over which the MAD is computed. The horizontal/vertical filter size (Filter Size
hor./vert.) specifies the support of a Gaussian filter, with variance set to 1/2 of the support size. The
step size is the horizontal and vertical distance between neighboring pixels in the reference image for
which an estimate of the motion is computed, the subsampling factor (SSF) is the horizontal and vertical
subsampling used when computing the MAD over the measurement window, and the accuracy of estimation is in
terms of the sampling period of the lower resolution rectangular lattice. Note that all units for the
parameters are relative to the spatial sampling period of the lower resolution rectangular lattice (i.e.,
refinement to 1/4-pixel accuracy, relative to the lower resolution rectangular lattice, is performed in
the final level of HBM).
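
The Gaussian filter parameterization described above, with the variance set to 1/2 of the support size,
can be sketched as below. The sketch assumes a 1-D kernel centered on its support and applied separably
in the horizontal and vertical directions, and it normalizes the taps to unit sum; these choices and the
name `gaussian_kernel` are assumptions for illustration.

```python
import math


def gaussian_kernel(support):
    """1-D Gaussian kernel with `support` taps and variance equal to support / 2."""
    variance = support / 2.0
    center = (support - 1) / 2.0
    weights = [math.exp(-((i - center) ** 2) / (2.0 * variance))
               for i in range(support)]
    total = sum(weights)
    return [w / total for w in weights]
```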

Significant non-translational mapping transformations (resulting from rotations, shears and zooms) cannot
be accurately modeled using the block matching techniques described above. It is preferable to model
inter-image motion resulting in such mapping transformations using a global affine transformation defined
by the parameters $c_1$-$c_6$ in the [...]

The technique that can be used to estimate the parameters $c_1, c_2, \ldots, c_6$ is described in J.
Bergen, P. Burt, R. Hingorani, and S. Peleg, "A three-frame algorithm for estimating two-component image
motion," IEEE Trans. Pattern Anal. Intel., vol. 14, pp. 886-896, September 1992. This estimation method
requires spatial and temporal derivatives to be estimated. The spatial derivatives are estimated using a
2-D second-order polynomial least-squares fit over a 5x5 window centered at each pixel, while the temporal
derivatives are computed using a 2-point finite forward difference at each pixel. Prior to estimating
these derivatives, the images are blurred using an 11x11 pixel uniform blur to reduce the effects of noise.

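The derivative estimates described above may be illustrated with the following sketch. It relies on the
fact that, on a symmetric 5x5 window, the first-order terms of a second-order polynomial least-squares fit
are orthogonal to the remaining basis terms, so the fitted slope at the window center reduces to a
weighted sum with weights equal to the offset divided by 50 (the sum of squared offsets over the window).
Border handling by edge replication in the blur, the requirement that the pixel lie at least two samples
from the border, and all function names are assumptions made for illustration.

```python
def uniform_blur(img, size=11):
    """size x size uniform (box) blur; borders are handled by edge replication."""
    half = size // 2
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for dr in range(-half, half + 1):
                rr = min(rows - 1, max(0, r + dr))
                for dc in range(-half, half + 1):
                    cc = min(cols - 1, max(0, c + dc))
                    acc += img[rr][cc]
            out[r][c] = acc / (size * size)
    return out


def spatial_derivatives(img, r, c):
    """Horizontal and vertical derivatives at (r, c) from a 5x5 quadratic fit."""
    sum_h = 0.0
    sum_v = 0.0
    for dr in range(-2, 3):
        for dc in range(-2, 3):
            value = img[r + dr][c + dc]
            sum_h += dc * value
            sum_v += dr * value
    # Sum of squared offsets over the 5x5 window: 5 * (4 + 1 + 0 + 1 + 4) = 50.
    return sum_h / 50.0, sum_v / 50.0


def temporal_derivative(img_now, img_next, r, c):
    """2-point finite forward difference in time at pixel (r, c)."""
    return img_next[r][c] - img_now[r][c]
```
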
In case of color imagery, motion is estimated in the luminance domain. The same motion information is then
used in separately processing the primary color channels (e.g., red, green, and blue) of the given lower
resolution images. Therefore, an RGB to luminance and two chroma (e.g., YUV) transformation is applied to
the lower resolution images prior to motion estimation for forming the mapping transformations.
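
An RGB to luminance and two chroma transformation of the kind mentioned above can be sketched as follows.
The coefficients shown are the widely used ITU-R BT.601 luma weights with the conventional YUV chroma
scaling; the specification does not state which coefficients are used, so they are an assumption here, as
are the function names.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB sample to (Y, U, V) using BT.601 luma weights (assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v


def luminance_image(rgb_image):
    """Luminance plane of an image stored as rows of (r, g, b) tuples."""
    return [[rgb_to_yuv(r, g, b)[0] for (r, g, b) in row] for row in rgb_image]
```

Only the luminance plane is needed for motion estimation; the estimated motion is then reused for each
color channel.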

B. Modeling and Computing The Combined PSF

[...] this model that is then used in computing the combined PSF.

We first describe a model that relates the given lower resolution images to the actual high resolution
image, at a particular reference time $t_r$, through a continuous linear shift variant (LSV) blurring
relationship. Towards this end, first, an image formation model is described. The mapping transformations
are then incorporated into the formation model, resulting in the desired LSV relationship expressed in
terms of a combined blur PSF. Next, a discretization is presented to relate a discrete version of the high
resolution image to the observed lower resolution imagery, by means of a corresponding discrete LSV
relationship, expressed in terms of a discrete combined blur PSF. Finally, a practical method is provided
to compute the combined PSF for subsequent reconstruction of the high resolution image.

Image Formation Model

The image formation model we use is depicted in Fig. 7. In the figure, the input signal $f(x_1,x_2,t)$
denotes the actual high resolution imagery in the continuous domain, whose discrete estimate is desired.
The effects of the physical dimensions of the lower resolution sensor, i.e., blur due to integration over
the sensor area, and the blur of the optical system are modeled in the first stage 60 of the figure. The
high resolution imagery $f(x_1,x_2,t)$ is convolved with both the kernel representing the shape of the
sensor $h_a(x_1,x_2,t)$, and the optical blur kernel $h_o(x_1,x_2,t)$. These are both functions of time,
but we restrict them to be constant over the aperture time. The optical blur and aperture dimensions are
thus allowed to differ from image to image.

The effect of aperture time is modeled in the second stage 62 of Fig. 7 by a time-domain integrator whose
output is given by

$$g_2(x_1,x_2,t) = \frac{1}{T_a}\int_{t-T_a}^{t} g_1(x_1,x_2,\tau)\,d\tau, \tag{2}$$

where $T_a$ denotes the sensor aperture time. Note that the first two stages 60 and 62 commute, since the
first is spatially linear shift-invariant (LSI) and the second is temporally LSI.

The third stage 64 in Fig. 7 models low resolution sampling using the arbitrary space-time lattice
$\Lambda_s$. The output of this stage is denoted by $g_2(m_1,m_2,k)$. As a matter of convention, integer
values $m_1$, $m_2$, and $k$, that appear as a function argument, are interpreted as in

[...] (3)

where $V_s$ denotes the [...] modeling step 66, additive [...]

Including Motion



We now incorporate a motion model into the image formation model to establish the desired LSV relationship
between the lower resolution imagery and the desired high resolution image at a fixed but arbitrary time
instance $t_r$. By appropriately setting the value(s) of $t_r$, a single still high resolution image, or
high resolution video images, comprised of a sequence of high resolution images, can be reconstructed.

When a motion model is applied to the image formation model, the first two stages 60 and 62 in Fig. 7 can
be combined to form a single LSV relation. We begin by considering motion as in

$$f(x,t) = f(M(x,t,t_r),\,t_r) = f(x_{t_r},t_r), \tag{4}$$

where $x$ denotes $(x_1,x_2)$, and $M(x,t,t_r)$ is a mapping transformation relating the position of an
intensity at position $x$ and time $t$, to its position at time $t_r$. This equation expresses the
well-known assumption of intensity conservation along motion trajectories. By letting
$h_1(x,t) = h_a(x,t) ** h_o(x,t)$, the output of the first modeling stage can be expressed as

$$g_1(x,t) = \int h_1(x-\chi)\,f(\chi,t)\,d\chi. \tag{5}$$

By making the change of variables $x_{t_r} = M(\chi,t,t_r)$ and using (4), (5) becomes

$$g_1(x,t) = \int h_1\big(x - M^{-1}(x_{t_r},t,t_r)\big)\,f(x_{t_r},t_r)\,|J(M)|^{-1}\,dx_{t_r}, \tag{6}$$

where $M^{-1}$ denotes the inverse transformation, $J(M)$ denotes the Jacobian of $M$, and $|\cdot|$
denotes the determinant operator. It is evident from (6) that the first stage of the model has been
transformed into an LSV operation, acting on a high resolution image at time $t_r$. To reflect this fact,
we let

$$h_1(x;x_{t_r};t,t_r) = h_1\big(x - M^{-1}(x_{t_r},t,t_r)\big)\,|J(M)|^{-1} \tag{7}$$

denote the combined LSV blur point spread function (PSF) modeling the effect of the sensor geometry,
optical blur, and relative motion. The effect of this equation is depicted in Fig. 8, where the picture at
the left depicts the imaging process at time $t$, where the aperture 68 of a sensor element is imposed on
the picture. The picture to the right shows the equivalent imaging process at time $t_r$. Notice the
mapping transformation applied to the aperture 68 in going from time $t$ to $t_r$ is the inverse of the
mapping transformation applied to the image 70. Rewriting (6) in LSV form yields

$$g_1(x,t) = \int h_1(x;x_{t_r};t,t_r)\,f(x_{t_r},t_r)\,dx_{t_r}. \tag{8}$$

The second modeling stage can now be expressed as

$$g_2(x,t) = \frac{1}{T_a}\int_{t-T_a}^{t}\int h_1(x;x_{t_r};\tau,t_r)\,f(x_{t_r},t_r)\,dx_{t_r}\,d\tau. \tag{9}$$

By changing the order of the integrations, the above becomes

$$g_2(x,t) = \int h_2(x;x_{t_r};t,t_r)\,f(x_{t_r},t_r)\,dx_{t_r}, \tag{10}$$

where

$$h_2(x;x_{t_r};t,t_r) = \frac{1}{T_a}\int_{t-T_a}^{t} h_1(x;x_{t_r};\tau,t_r)\,d\tau. \tag{11}$$

Thus the first two stages of the model have been combined into a single LSV system, acting on the
continuous high resolution image at time $t_r$. This allows us to write the observed lower resolution
imagery in terms of a continuous high resolution image at time $t_r$, as

$$g(m_1,m_2,k) = \int h_2(m_1,m_2;x_{t_r};k,t_r)\,f(x_{t_r},t_r)\,dx_{t_r} + v(m_1,m_2,k), \tag{12}$$

where $h_2(\cdot)$ is the effective LSV blur PSF, and the integer arguments $m_1$, $m_2$, and $k$ have the
same interpretation as in (3).

It is desirable to discretize the LSV blur relationship in (12), to relate the observed lower resolution
images to a discrete version of the actual high resolution image $f(x_1,x_2,t_r)$. Thus, a discrete
superposition summation of the form

$$g(m_1,m_2,k) = \sum_{(n_1,n_2)} f(n_1,n_2,t_r)\,h_{t_r}(n_1,n_2;m_1,m_2,k) + v(m_1,m_2,k) \tag{13}$$

will now be formulated. We assume that the continuous imagery $f(x_1,x_2,t_r)$ is sampled on the 2-D
lattice $\Lambda_{t_r}$ (i.e., $(n_1,n_2)$ are integers that specify a point in $\Lambda_{t_r}$), by a
high resolution sensor, to form $f(n_1,n_2,t_r)$. By appropriately choosing $t_r$ and $\Lambda_{t_r}$,
sampling of $f(n_1,n_2,t_r)$ can be formed over an arbitrary space-time lattice.

An individual high resolution sensor element (giving rise to a single high resolution image pixel) is
assumed to have physical dimensions which can be used as a unit cell $U_{t_r}$ for the lattice
$\Lambda_{t_r}$. Thus, the entire space of the focal plane is completely covered by the high resolution
sensor. The term $U_{t_r}(n_1,n_2)$ is used to denote the unit cell $U_{t_r}$ shifted to the location
specified by $(n_1,n_2)$. With this definition, and with the assumption that $f(x_1,x_2,t_r)$ is
approximately constant over $U_{t_r}(n_1,n_2)$, (12) can be written as

$$g(m_1,m_2,k) = \sum_{(n_1,n_2)} f(n_1,n_2,t_r) \iint_{U_{t_r}(n_1,n_2)} h_2(m_1,m_2;x_{t_r};k,t_r)\,dx_{t_r} + v(m_1,m_2,k). \tag{14}$$

By comparing (13) with (14), it is evident that

$$h_{t_r}(n_1,n_2;m_1,m_2,k) = \iint_{U_{t_r}(n_1,n_2)} h_2(m_1,m_2;x_{t_r};k,t_r)\,dx_{t_r}. \tag{15}$$

A pictorial example of the discrete LSV PSF formulation, with a rectangular high resolution lattice
$\Lambda_{t_r}$, is provided in Fig. 9. In the figure, it is assumed that the motion is purely
translational, that a square lower resolution sensor aperture 68 centered on a point $(m_1,m_2)$ 73 is
used, and that there is no optical blur. The $(x_1,x_2)$ space is the sensor focal plane at time $t_r$.
The focal plane is shown covered by shifted high resolution sampling unit cells $U_{t_r}(n_1,n_2)$ 69. The
region of the focal plane "swept" by the lower resolution sensor aperture 68 during the aperture time
$T_a$ is shown by dotted outline 71. The discrete LSV PSF specified in (15) is formed by computing the
duration of time a given area of the lower resolution sensor 68 "dwelled" over a region
$U_{t_r}(n_1,n_2)$ 69, while translating from its position at the aperture opening time, to its position
73 at the aperture closing time. Note that the result indicated by (15) does not specify a simple area of
overlap between the area 71 swept by sensor aperture 68 and the high resolution sampling regions
$U_{t_r}(n_1,n_2)$ 69.


Computation of the Combined PSF

A practical method for computing the blur function $h_{t_r}(\cdot)$ given by (15) is described. Two cases
will be treated to [...] considered. To solve this second case, a general approximation is given, that
leads to a blur computation method that is built on top of the method delineated for the translational
motion case. Relative to this approximation, we provide specific methods for affine and perspective
transformation models of motion.



For the case of translations! motion, we define piece wise constant velocity motion paths, affective during the k lh 
opening of the aperture (i.e. acquiring the k* lower resolution image at time t k ), as 

x h = M{xjJ r } = x-i-x bk +v k [t-{t k -T a )), (16) 

where the velocities v, k and v 2 k , where v,, = [v, k v 2 J', are assumed to be constant over the aperture time T 8 , (t|,-T a ) 
is the time of the k ih opening of the aperture, and X b denotes the relative initial position at the k th opening of the 
aperture. The quantity x 0k is a (unction oi the times t k and !.. if for the moment the optica! blur is ignored, then the PSF 



A,(x.,„ ;,.,,) 

Hi[x~x t Jj r ) = h2{x;x h ;!,t r ) 



and applying (7) and (11) 



    h'_2(x; t, t_r) = \frac{1}{T_a} \int_0^{T_a} h_a(x + x_{b_k} + v_k \tau) \, d\tau



If we now assume the aperture response is a 2-D "rect" function given by

    h_a(x) = \begin{cases} 1, & |x_1| < \Delta M_1 / 2 \text{ and } |x_2| < \Delta M_2 / 2, \\ 0, & \text{else,} \end{cases}

then h'_2 can be computed graphically as depicted in Fig. 10. The coordinate x + x_{b_k} sets the starting point 76 of the line
78 shown in the figure, at time \tau = 0. The integral follows the line 78 to its endpoint 80 at \tau = T_a, and the result is simply
the length of the line segment 78 that intersects the aperture 68.
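
The graphical construction of Fig. 10 can be sketched in code as follows. This is a minimal illustration, not the patented implementation: it assumes a rect aperture centered at the origin with widths dm1 and dm2, a straight constant-velocity path, and the hypothetical function names dwell_interval and path_length_in_aperture. It clips the path against the aperture one axis at a time and reports the length of the part that lies inside.

```python
import math


def dwell_interval(start, velocity, t_a, dm1, dm2):
    """Return (tau_in, tau_out) during which start + velocity * tau lies inside
    the aperture |x1| <= dm1/2, |x2| <= dm2/2, or None if the path misses it."""
    lo, hi = 0.0, t_a
    axes = ((start[0], velocity[0], dm1 / 2.0), (start[1], velocity[1], dm2 / 2.0))
    for p, v, half in axes:
        if v == 0.0:
            # No motion along this axis: the path is either always in or always out.
            if abs(p) > half:
                return None
            continue
        t0, t1 = (-half - p) / v, (half - p) / v
        if t0 > t1:
            t0, t1 = t1, t0
        lo, hi = max(lo, t0), min(hi, t1)
        if lo >= hi:
            return None
    return lo, hi


def path_length_in_aperture(start, velocity, t_a, dm1, dm2):
    """Length of the motion path segment that intersects the aperture."""
    interval = dwell_interval(start, velocity, t_a, dm1, dm2)
    if interval is None:
        return 0.0
    return math.hypot(velocity[0], velocity[1]) * (interval[1] - interval[0])
```

Dividing the dwell interval by the aperture time gives the fraction of the exposure spent over the aperture, which is the quantity the integral accumulates up to a scaling constant.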

To further describe the blur h'_2, consider the case where v_{1,k} T_a > \Delta M_1 and v_{2,k} T_a > \Delta M_2. The point spread function
h'_2(x - x_{b_k}) (the shift is used for convenience) can then be segmented into regions within the (x_1, x_2) plane, as shown in
Fig. 11. In each of the 7 regions depicted in the figure, the value of h'_2(x - x_{b_k}) is described by a linear equation in x_1
and x_2. For instance, in the region marked with the number 1, which is a parallelogram, the value of h'_2 is constant. In
the trapezoidal region marked with a 2, h'_2 is found using the equation

    h'_2(x_1 - x_{1,b_k}, x_2 - x_{2,b_k}) = [right-hand side illegible in the source]    (18)

for (x_1, x_2) within Region 2,

where K is a scaling constant that can be accounted for by normalizing the discrete PSF.
The discrete PSF h_{t_r} of (15) is computed by integrating over the volume under the region U_{t_r}(n_1, n_2) 69 shown in Fig. 11.
The center 82 of this region 69 is located at

    x_s(m_1, m_2, k) + x_{b_k} - x_{t_r}(n_1, n_2),

where x_{t_r}(n_1, n_2) is defined similarly to x_s(m_1, m_2, k). That is, the discrete PSF is computed by finding the volume under

    h'_2(x_1 - x_{1,b_k}, x_2 - x_{2,b_k})

within this region.
The optical blur h_o(x, t) can be subsequently taken into account using a discrete approximation by carrying out the
convolution

    h_{t_r}(n_1, n_2; m_1, m_2, k) **_{n_1,n_2} [discrete optical blur term, illegible in the source]    (20)

where the second operand is the discrete representation of the focus blur for the k-th lower resolution image, and **_{n_1,n_2} denotes 2-D discrete
convolution over the variables (n_1, n_2). By taking the optical blur into account in this way, we are making the assumption
that the blur PSF h_{t_r}, within a region about x_s(m_1, m_2, k), is approximately LSI. This is a reasonable assumption as long
as the image has not undergone an extreme non-translational motion. Handling the optical blur as in (20) is attractive,
since h_{t_r} can easily be computed when the optical blur is not considered, and the convolution in (20) is easy to implement.

In a preferred implementation, the optical blur PSF is set equal to a Gaussian with unity variance and with 5x5 pixel
support, expressed in terms of the high resolution sampling lattice.
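
As a rough illustration of this step, the sketch below builds a 5x5 Gaussian kernel with unit variance and convolves it with a discrete motion PSF. The helper names gaussian_kernel and convolve2d_full and the three-tap motion_psf are assumptions made for the example; the source does not give an implementation, and the kernel is normalized to unit sum here as a convenience.

```python
import numpy as np


def gaussian_kernel(size=5, variance=1.0):
    """Sampled 2-D Gaussian on a size x size grid, normalized to sum to one."""
    half = size // 2
    axis = np.arange(-half, half + 1, dtype=float)
    xx, yy = np.meshgrid(axis, axis, indexing="ij")
    kernel = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * variance))
    return kernel / kernel.sum()


def convolve2d_full(a, b):
    """Full 2-D discrete convolution of two small arrays."""
    out = np.zeros((a.shape[0] + b.shape[0] - 1, a.shape[1] + b.shape[1] - 1))
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            # Each sample of a contributes a shifted, scaled copy of b.
            out[i:i + b.shape[0], j:j + b.shape[1]] += a[i, j] * b
    return out


# Hypothetical motion-only PSF: uniform dwell over three high resolution cells.
motion_psf = np.full((1, 3), 1.0 / 3.0)
combined_psf = convolve2d_full(motion_psf, gaussian_kernel())
```

Because both factors sum to one, the combined PSF also sums to one, so no further normalization is needed in this toy case.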

2) General motion. 

We now extend this method for computing the blur to the case of more complex motions, such as those described
by affine or perspective transformations. The extension is based on the following observation: the transformation due
to motion between the times t_r and t_k may be significant; however, the non-translational component of the transformation
that affects the blur shape will be small during the aperture time. This concept is demonstrated in Fig. 12. The
figure is a graphical representation of the computation described in (11), which is rewritten here as

    h_2(x, x_{t_r}; t, t_r) = \int_{t-T_a}^{t} h_1(x - M^{-1}(x_{t_r}, \tau, t_r)) \, |J(M(x_{t_r}, \tau, t_r))|^{-1} \, d\tau    (21)



The figure depicts the transformed blur kernel h_1(.) as it translates across the plane x_{t_r} during the integration time, from
t - T_a 84 to t 86. The value of h_2(.) is then the "dwell" time over x_{t_r}, weighted by the Jacobian and the amplitude of h_1(.).
Computation of (21) is difficult since the translating kernel h_1 in (21) is continuously transforming during the integration
period. As previously pointed out, however, the non-translational component of this transformation, during the aperture
time, is assumed to be small. This effect is demonstrated in Fig. 12 by showing the dotted outline of the function

    h_1(M^{-1}(x_{t_r}, t - T_a, t_r))

84 superimposed on

    h_1(M^{-1}(x_{t_r}, t, t_r))

86. In terms of (21), the approximation makes the assumptions that: (i) the Jacobian weighting is a constant, (ii) the
transformation

    M(x_{t_r}, t - T_a, t_r)

is maintained throughout the aperture time (i.e., this function only translates as x changes), and (iii) the path of
translation during two consecutive frames, and thus within the aperture time, is linear. With this approximation, (21) can
be rewritten as

    h_2(x, x_{t_r}; t, t_r) \approx |J(M(M(x, t - T_a, t_r), t, t_r))|^{-1} \int_0^{T_a} h_1(\epsilon(x, \tau) - M^{-1}(x_{t_r}, t - T_a, t_r)) \, d\tau    (22)

where

    \epsilon(x, \tau) = \frac{T - \tau}{T} x + \frac{\tau}{T} M^{-1}(M(x, t - T_a + T, t_r), t - T_a, t_r)    (23)

and T is the time between consecutive frames.

Using this approximation, the same procedure for computing the blur in the case of spatially uniform, temporally
piecewise constant-velocity translational motion is used, except that at each point x the blur is computed with the
appropriate transformation applied to the rectangular function 68 depicted in Fig. 10. To summarize, when the
transformation is defined by uniform and constant translations, the approximation will result in an exact blur computation.
When the transformation is affine, the Jacobian does not vary with x_{t_r}, but we have approximated it to be constant
over time while the aperture is open. Additionally, the translation is assumed to be constant-velocity, which may
not necessarily be the case. In the case of perspective motion, the approximation has the same effects as in the affine
case, with the additional approximation that the Jacobian is constant over the spatial blur support of h_1(.).



C. Reconstruction of The High Resolution Image 



Given the combined blur PSF, h_{t_r}, the motion vector field estimates from lower resolution images to the reference
image, and the high resolution sampling lattice, the high resolution image is reconstructed using the following technique
based on the method of POCS. In POCS, the desired image distribution is assumed to be a member of a mathematical
vector space, such as the P-dimensional vector space, for an N1 pixels by N2 lines (P = N1xN2) high resolution image.
The method of POCS requires the definition of closed convex constraint sets within this vector space that contain the
actual high resolution image. The mathematical intersection of these sets contains the actual high resolution image since it
is contained in each one of them. An estimate of the actual high resolution image is then defined as a point in the
intersection of these constraint sets, and is determined by successively projecting an arbitrary initial estimate onto the
constraint sets.

Associated with each constraint set is a projection operator P, mapping an arbitrary point within the space, but
outside the set, to the closest point within the set. Relaxed projection operators, T = (1 - \lambda)I + \lambda P, 0 < \lambda < 2, where I denotes
the identity operator, can also be defined and used in finding an estimate in the intersection set.
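
To make the projection and relaxation operators concrete, here is a small sketch in two dimensions. The two convex sets (a unit disk and a half-plane), the function names, and the iteration count are all assumptions chosen for illustration; they are unrelated to the image constraint sets defined later.

```python
import numpy as np


def project_disk(x, radius=1.0):
    """Closest point to x in the closed disk of the given radius."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)


def project_halfplane(x, bound=0.5):
    """Closest point to x in the half-plane x[0] >= bound."""
    return np.array([max(x[0], bound), x[1]])


def relaxed(project, x, lam=1.0):
    """Relaxed projection T = (1 - lam) I + lam P, intended for 0 < lam < 2."""
    return (1.0 - lam) * x + lam * project(x)


def pocs(x0, projections, lam=1.0, iterations=200):
    """Cycle through the (relaxed) projections starting from x0."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iterations):
        for project in projections:
            x = relaxed(project, x, lam)
    return x


point = pocs([-3.0, 4.0], [project_disk, project_halfplane])
```

With lam = 1 the relaxed operator reduces to the plain projection, and the iterate settles on a point that belongs to both sets.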

We now describe how we utilize the principles of POCS in developing a method to solve the high resolution
reconstruction problem at hand. We define the following closed, convex constraint set for each pixel within the lower
resolution image sequence g(m_1, m_2, k):

    C_{t_r}(m_1, m_2, k) = \{ y(n_1, n_2, t_r) : |r^{(y)}(m_1, m_2, k)| \le \delta_0 \},    (24)

where

    r^{(y)}(m_1, m_2, k) = g(m_1, m_2, k) - \sum_{n_1, n_2} y(n_1, n_2, t_r) \, h_{t_r}(n_1, n_2; m_1, m_2, k),    (25)

is the residual associated with an arbitrary member, y, of the constraint set. We refer to these sets as the data
consistency constraint sets. The quantity \delta_0 is an a priori bound reflecting the statistical confidence with which the actual
image is a member of the set C_{t_r}(m_1, m_2, k). Since r^{(f)}(m_1, m_2, k) = v(m_1, m_2, k), where f denotes the actual high resolution
image, the statistics of r^{(f)}(m_1, m_2, k) are identical to those of v(m_1, m_2, k). Hence the bound \delta_0 is determined from the
statistics of the noise process so that the actual image (i.e., the ideal solution) is a member of the set within a certain
statistical confidence. For example, if the noise has a Gaussian distribution with standard deviation \sigma, \delta_0 is set equal to
c\sigma, where c is determined by an appropriate statistical confidence bound (e.g., c=3 for 99% confidence).
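
The relation between the multiplier c and the confidence level can be checked numerically. The sketch below assumes zero-mean Gaussian noise, for which the probability that a sample lies within c standard deviations is erf(c / sqrt(2)); the function names are hypothetical.

```python
import math


def confidence_for_multiplier(c):
    """Probability that zero-mean Gaussian noise satisfies |v| <= c * sigma."""
    return math.erf(c / math.sqrt(2.0))


def bound_from_sigma(sigma, c=3.0):
    """Data consistency bound delta_0 = c * sigma."""
    return c * sigma
```

For c = 3 this evaluates to roughly 0.997, which is consistent with the "99% confidence" figure quoted above.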

It is also possible, in practice, to directly adjust the value of \delta_0. As \delta_0 increases, the reconstructed image becomes

tests, setting \delta_0 equal to 0.01 has resulted in fairly rapid convergence of the POCS method to an image with sufficiently
good quality.

Note that the sets C_{t_r}(m_1, m_2, k) can be defined only at those spatial locations where samples of the lower resolution
images are available. This enables the invention to be applicable to any arbitrary lower resolution sampling lattice.
Further, the sets C_{t_r}(m_1, m_2, k) can be defined only for those samples of the lower resolution images where there are
no occlusions and uncovered regions. The latter fact makes the invention adaptive to changing scenes within a given
motion imagery. In short, constraint sets are defined only for appropriate samples of the lower resolution imagery.

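The projection z(n_1, n_2, t_r) = P_{t_r}(m_1, m_2, k)[x(n_1, n_2, t_r)] of an arbitrary x(n_1, n_2, t_r) onto C_{t_r}(m_1, m_2, k) can be defined as:

    P_{t_r}(m_1, m_2, k)[x(n_1, n_2, t_r)] =    (26)

    \begin{cases}
    x(n_1, n_2, t_r) + \frac{(r^{(x)}(m_1, m_2, k) - \delta_0) \, h_{t_r}(n_1, n_2; m_1, m_2, k)}{\sum_{o_1} \sum_{o_2} h_{t_r}^2(o_1, o_2; m_1, m_2, k)}, & r^{(x)}(m_1, m_2, k) > \delta_0 \\
    x(n_1, n_2, t_r), & -\delta_0 \le r^{(x)}(m_1, m_2, k) \le \delta_0 \\
    x(n_1, n_2, t_r) + \frac{(r^{(x)}(m_1, m_2, k) + \delta_0) \, h_{t_r}(n_1, n_2; m_1, m_2, k)}{\sum_{o_1} \sum_{o_2} h_{t_r}^2(o_1, o_2; m_1, m_2, k)}, & r^{(x)}(m_1, m_2, k) < -\delta_0
    \end{cases}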
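
A compact sketch of the residual (25) and the projection (26) for a single lower resolution pixel is given below. It assumes the combined PSF for that pixel is stored as an array h with the same shape as the high resolution estimate x; the function names are hypothetical.

```python
import numpy as np


def residual(g_pixel, x, h):
    """Residual (25): observed pixel minus the PSF-weighted sum of the estimate."""
    return g_pixel - np.sum(x * h)


def project_data_consistency(x, g_pixel, h, delta0):
    """Projection (26) of x onto the set where |residual| <= delta0."""
    r = residual(g_pixel, x, h)
    energy = np.sum(h * h)
    if energy == 0.0 or abs(r) <= delta0:
        # Already consistent with the observation (or an empty PSF): no change.
        return x.copy()
    shift = r - delta0 if r > delta0 else r + delta0
    # Back-project the excess residual through the PSF.
    return x + shift * h / energy
```

After the projection, the residual of the updated estimate sits exactly on the boundary of the set (plus or minus delta0), which is what makes the result the closest consistent image.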



Additional constraints such as bounded energy, positivity, and limited support can be utilized.
We use the amplitude constraint set,

    C_A = \{ y(n_1, n_2, t_r) : \alpha \le y(n_1, n_2, t_r) \le \beta \},    (27)

with amplitude bounds of \alpha = 0 and \beta = 255. The projection P_A onto the amplitude constraint set C_A is defined as

    P_A[x(n_1, n_2, t_r)] = \begin{cases} 0, & x(n_1, n_2, t_r) < 0 \\ x(n_1, n_2, t_r), & 0 \le x(n_1, n_2, t_r) \le 255 \\ 255, & x(n_1, n_2, t_r) > 255 \end{cases}    (28)



Given the above projections, an estimate, \hat{f}(n_1, n_2, t_r), of the high resolution image f(n_1, n_2, t_r) is obtained iteratively
from all lower resolution images g(m_1, m_2, k), where constraint sets can be defined, as

    \hat{f}_{\ell+1}(n_1, n_2, t_r) = T_A \tilde{T} [\hat{f}_{\ell}(n_1, n_2, t_r)],    \ell = 0, 1, 2, ...    (29)

where \tilde{T} denotes a cascade of the relaxed projection operators, projecting onto the family of sets
C_{t_r}(m_1, m_2, k).

Any lower resolution image, bilinearly interpolated over the high resolution sampling lattice, can be used as the initial
estimate

    \hat{f}_0(n_1, n_2, t_r).

Choosing the lower resolution image with the best visual quality for initialization may increase the speed of reaching
an iteration number \ell at which a visually satisfactory high resolution image is reconstructed. In theory, the iterations
continue until an estimate lies within the intersection of all the constraint sets. In practice, however, iterations are
generally terminated according to a certain stopping criterion, such as visual inspection of the image quality, or when
changes between successive estimates, as measured by some difference metric (e.g.,

    \| \hat{f}_{\ell+1} - \hat{f}_{\ell} \|

using the L_2 norm), fall below a preset threshold.
A pictorial depiction of this method is given in Fig. 13. The combined LSV blur relates a region 71 of the current
high resolution image 88 estimate, say \hat{f}_{\ell}(.), to a particular pixel intensity g(m_1, m_2, k) 90 in one of the lower resolution
images 46, 48, 53. The residual term

    r^{(\hat{f}_{\ell})}(m_1, m_2, k)

is then formed, which indicates whether or not the observation could have been formed from the current high resolution
image estimate (within some error bound determined by \delta_0), and therefore whether the high resolution estimate belongs
to the data consistency set

    C_{t_r}(m_1, m_2, k).

If it is not in the set (i.e. the residual is too large), the projection operator

    P_{t_r}(m_1, m_2, k)

back projects the residual onto the current high resolution image 88 estimate (the additive term in (26)), thus forming
a new estimate of the high resolution image that does belong to the set

    C_{t_r}(m_1, m_2, k),

and therefore could have given rise to the observation g(m_1, m_2, k), within the bound \delta_0. Performing these projections
over every lower resolution pixel 90 where a consistency constraint set is defined completes the composite projection
\tilde{T} referred to in (29). Subsequent projection onto the amplitude constraint set completes a single iteration of the POCS
method, resulting in the next estimate \hat{f}_{\ell+1}(.).
One possible implementation of the POCS based reconstruction method is as follows:

1. Choose the reference image, and thus the reference time t_r.

2. Specify the high resolution lattice, and determine the ratio between the density of the high resolution lattice and
the lower resolution rectangular lattice over which the image values are generated via bilinear interpolation for
motion estimation purposes. We refer to this ratio as r. (For instance, r=2 in the examples given in Figs. 2 and 6.)

3. Perform motion estimation: spatially bilinearly interpolate each lower resolution image g(m_1, m_2, k) to the lower
resolution rectangular lattice; estimate motion from each interpolated lower resolution image to the interpolated lower
resolution image at t_r; scale the estimated motion vectors by r.

4. Define sets C_{t_r}(m_1, m_2, k) according to (24), for each pixel site (m_1, m_2, k) where the motion path is valid.

5. Compute the combined blur PSF h_{t_r}(n_1, n_2; m_1, m_2, k) for every site (m_1, m_2, k) where the sets
C_{t_r}(m_1, m_2, k) have been defined.

6. Set \hat{f}_0(n_1, n_2, t_r) equal to the lower resolution image that has the best visual quality, after bilinearly
interpolating it over the sampling lattice of the high resolution image.

7. For all sites (m_1, m_2, k) where the sets C_{t_r}(m_1, m_2, k) have been defined: compute the residual
r^{(\hat{f}_{\ell})}(m_1, m_2, k) according to (25); back-project the residual using the projection
P_{t_r}(m_1, m_2, k) in (26).

8. Perform the amplitude projection P_A using (28).

9. If the stopping criterion is satisfied then stop, otherwise go to Step 7.
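
Steps 7 to 9 above can be sketched as a short loop. This is an illustrative toy under stated assumptions, not the patented implementation: each observation is a hypothetical pair (g_pixel, h) holding one lower resolution sample and its combined PSF stored on the high resolution grid, the initial estimate is all zeros rather than a bilinearly interpolated image, unrelaxed projections are used, and the names pocs_reconstruct, max_iter, and tol are invented for the example.

```python
import numpy as np


def pocs_reconstruct(f0, observations, delta0, max_iter=50, tol=1e-6):
    """Iterate data consistency and amplitude projections (steps 7 to 9)."""
    f = np.asarray(f0, dtype=float).copy()
    for _ in range(max_iter):
        previous = f.copy()
        for g_pixel, h in observations:
            r = g_pixel - np.sum(f * h)            # residual, as in (25)
            energy = np.sum(h * h)
            if energy > 0.0 and abs(r) > delta0:   # outside the set: project, as in (26)
                f = f + (r - np.sign(r) * delta0) * h / energy
        f = np.clip(f, 0.0, 255.0)                 # amplitude projection, as in (28)
        if np.linalg.norm(f - previous) <= tol:    # L2 stopping criterion
            break
    return f


# Toy data (hypothetical): each observation is the mean of one 2x2 block.
truth = np.arange(16, dtype=float).reshape(4, 4) * 10.0
observations = []
for i in (0, 2):
    for j in (0, 2):
        h = np.zeros((4, 4))
        h[i:i + 2, j:j + 2] = 0.25
        observations.append((float(np.sum(truth * h)), h))

estimate = pocs_reconstruct(np.zeros((4, 4)), observations, delta0=0.5)
```

In this toy setup the PSFs do not overlap, so one sweep already satisfies every data consistency constraint; with overlapping PSFs from several shifted frames, repeated sweeps are what let the constraints jointly refine the estimate.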

After the stopping criterion is satisfied, the image may then be displayed, stored for future display, or transmitted
for remote display.

The invention has been described with reference to a preferred embodiment. However, it will be appreciated that
variations and modifications can be effected by a person of ordinary skill in the art without departing from the scope
of the invention.

Parts List: 

10 lower resolution images 

12 provide mapping transformation step

14 compute combined PSF step

16 create high resolution image step 

18 display high resolution image step 

20 video camcorder/VCR 

22 digitizer

24 digital still camera

26 digital video camcorder 

28 digital scanner 

30 disc storage

32 image processing computer system

34 computer 

36 CRT 

38 keyboard 

40 printer 

46, 48, 50 low resolution images 

52 region of interest

53 low resolution image

54 diamond shaped sampling lattice

56 rectangular sampling lattice

60 optical system blur model 

62 aperture time model

64 low resolution sampling model

66 additive noise model

68 sensor aperture 

69 high resolution sampling regions

70 image 

71 area swept by sensor aperture

73 center of sensor aperture

76 starting point 

78 line 

80 end point 

82 center of region

84 transformed blur kernel

86 transformed blur kernel

90 low resolution image pixel 



1. A method for creating a high resolution image from a sequence of lower resolution motion images (10), comprising:

a. producing a mapping transformation (12) for each lower resolution image (10) to map pixels in each lower
resolution image into locations in the high resolution image;

b. computing (14) a combined point spread function (PSF) for each pixel in each lower resolution image (10)
employing the mapping transformations;

c. generating (16) the high resolution image from the lower resolution images employing the combined blur
PSF by projection onto convex sets (POCS); and

d. displaying (18) the high resolution image.

2. The method claimed in claim 1, wherein the step of producing a mapping transformation includes:

a. selecting one of the lower resolution images as a reference image;

b. estimating a mapping transformation describing the relative motion at each pixel between the reference
lower resolution image and each other lower resolution image;

c. testing the validity of the estimated mapping transformation for each other lower resolution image pixel and
flagging valid mapping transformations; and

d. scaling each valid mapping transformation from the lower resolution images to the high resolution image.

3. The method claimed in claim 1, wherein the step of computing a combined PSF includes:



4. The method claimed in claim 1, wherein the step of generating the high resolution image by POCS includes:

a. interpolating one of the lower resolution images to the number of pixels in the high resolution image to
produce an estimate of the high resolution image; and

b. for each pixel in each low resolution image having a valid mapping transformation, refining the estimate of
the high resolution image by,

i. selecting a pixel in one of the lower resolution images,

ii. producing a calculated pixel value from the high resolution image by applying the combined PSF for
the selected pixel to the current estimate of the high resolution image, and

iii. forming the difference between the selected pixel value and the calculated pixel value and, if the
magnitude of the difference is greater than a predetermined threshold, back projecting the error into the current
estimate of the high resolution image;

c. clipping the pixel values of the refined estimate of the high resolution image to an allowable range; and
d. repeating steps b and c until a stopping criterion is satisfied.

5. A method for producing a high resolution video sequence from a sequence of lower resolution video images,
comprising applying the method claimed in claim 1 a plurality of times to the sequence of lower resolution video
images to produce a video sequence of high resolution images.

6. Apparatus for creating a high resolution image from a sequence of lower resolution motion images, comprising:

a. a source (20, 22, 24, 26, 28, 30) for producing a sequence of lower resolution motion images;

b. an image processor (34, 36, 38) for receiving the sequence of lower resolution images and creating the
high resolution image, including:

i. means for producing a mapping transformation for each low resolution image to map pixels in each low
resolution image into locations in the high resolution image;

ii. means for computing a combined point spread function (PSF) for each pixel in each lower resolution
image employing the mapping transformations;

iii. means for generating the high resolution image from the lower resolution images employing the blur
PSF's by projection onto convex sets (POCS); and

c. a display device (40) for displaying the high resolution image.



a. calculating an effective sampling aperture relative to the high resolution image for each pixel in each lower
resolution image employing the mapping transformations;

b. calculating PSF's for the effective sampling apertures;

c. defining an optical PSF; and

d. combining the calculated PSF for each pixel with the optical PSF to produce the combined PSF.